Propagating Uncertainty in POMDP Value Iteration with Gaussian Processes
Abstract
In this paper, we describe the general approach of solving Partially Observable Markov Decision Processes (POMDPs) with approximate value iteration. Methods based on this approach have shown promise for tackling larger problems where exact methods are intractable, but we explain how most of them suffer from a fundamental problem: they ignore information about the uncertainty of their estimates. We then suggest a new value iteration method that uses Gaussian processes to form a Bayesian representation of the uncertain POMDP value function. We evaluate this method on several standard POMDPs and obtain promising results.
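To make the idea concrete, here is a minimal sketch (not the authors' implementation, whose details the abstract does not give) of representing a POMDP value function with a Gaussian process: a GP is fit to value-backup targets at sampled belief points, so a query at a new belief returns both a mean estimate and a predictive variance, which is exactly the uncertainty information that most approximate methods discard. The 3-state belief simplex, the stand-in backup targets, and the kernel choice are all illustrative assumptions.

# Minimal sketch, assuming scikit-learn: a Gaussian process over belief
# points gives each value estimate a predictive variance. The toy data
# below is hypothetical, for illustration only.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

# Hypothetical training data: 50 belief points on a 3-state simplex and
# stand-in value-iteration backup targets computed at those points.
beliefs = rng.dirichlet(np.ones(3), size=50)           # shape (50, 3)
backup_targets = beliefs @ np.array([0.0, 5.0, 10.0])  # toy targets

# RBF kernel over the belief simplex; WhiteKernel models backup noise.
gp = GaussianProcessRegressor(
    kernel=RBF(length_scale=0.3) + WhiteKernel(),
    normalize_y=True,
)
gp.fit(beliefs, backup_targets)

# Querying an unseen belief yields a mean value estimate *and* an
# uncertainty, unlike a plain point estimate of V(b).
b = np.array([[0.2, 0.3, 0.5]])
mean, std = gp.predict(b, return_std=True)
print(f"V(b) = {mean[0]:.2f} +/- {std[0]:.2f}")

The WhiteKernel term lets the GP treat the targets as noisy observations, which matters because value backups at sampled beliefs are themselves approximate.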
Similar Resources
Robot Planning in Partially Observable Continuous Domains
We present a value iteration algorithm for learning to act in Partially Observable Markov Decision Processes (POMDPs) with continuous state spaces. Mainstream POMDP research focuses on the discrete case and this complicates its application to, e.g., robotic problems that are naturally modeled using continuous state spaces. The main difficulty in defining a (belief-based) POMDP in a continuous s...
Bounded-Parameter Partially Observable Markov Decision Processes
The POMDP is considered a powerful model for planning under uncertainty. However, it is usually impractical to employ a POMDP with exact parameters to precisely model real-life situations, for reasons such as limited data for learning the model. In this paper, assuming that the parameters of POMDPs are imprecise but bounded, we formulate the framework of bounded-parameter...
Gaussian Processes for Fast Policy Optimisation of POMDP-based Dialogue Managers
Modelling dialogue as a Partially Observable Markov Decision Process (POMDP) enables a dialogue policy robust to speech understanding errors to be learnt. However, a major challenge in POMDP policy learning is to maintain tractability, so the use of approximation is inevitable. We propose applying Gaussian processes in reinforcement learning of optimal POMDP dialogue policies, in order (1) to m...
A Method for Speeding Up Value Iteration in Partially Observable Markov Decision Processes
We present a technique for speeding up the convergence of value iteration for partially observable Markov decision processes (POMDPs). The underlying idea is similar to that behind modified policy iteration for fully observable Markov decision processes (MDPs). The technique can be easily incorporated into any existing POMDP value iteration algorithms. Experiments have been conducted on ...
Monte Carlo Value Iteration for Continuous-State POMDPs
Partially observable Markov decision processes (POMDPs) have been successfully applied to various robot motion planning tasks under uncertainty. However, most existing POMDP algorithms assume a discrete state space, while the natural state space of a robot is often continuous. This paper presents Monte Carlo Value Iteration (MCVI) for continuous-state POMDPs. MCVI samples both a robot’s state s...
Journal:
Volume/Issue:
Pages: -
Publication date: 2004